User Partitioning for Less Overhead
in MIMO Interference Channels

Abstract

This paper presents a study on multiple-antenna interference channels, accounting for general overhead
as a function of the number of users and antennas in the network. The model includes both perfect
and imperfect channel state information based on channel estimation in the presence of noise. Three
low-complexity methods are proposed for reducing the impact of overhead in the sum network throughput
by partitioning users into orthogonal groups. The first method allocates spectrum to the groups equally,
creating an imbalance in the sum rate of each group. The second proposed method allocates spectrum
unequally among the groups to provide rate fairness. Finally, geographic grouping is proposed for cases
where some receivers do not observe significant interference from other transmitters. For each partitioning
method, the optimal solution not only requires a brute force search over all possible partitions, but also
requires full channel state information, thereby defeating the purpose of partitioning. We therefore propose
greedy methods to solve the problems, requiring no instantaneous channel knowledge. Simulations show
that the proposed greedy methods switch from time-division to interference alignment as the coherence
time of the channel increases, and have a small loss relative to optimal partitioning only at moderate
coherence times.

I. Introduction 

Interference channels model the case of simultaneous point-to-point transmission by two or more 
transmitters that do not have mutual knowledge of transmitted data for the purposes of coordinated 
precoding. Recent work on interference channels has shown that, theoretically, the capacity of such 
networks increases linearly with the number of transmit/receive pairs in the network [1], [2]. In particular,
by intelligently precoding the transmitted symbols, all the interference can be forced into a subspace of the 
received space at all receivers simultaneously. This precoding operation is called interference alignment
(IA). Although IA can achieve a linear rate scaling with the number of users in a network, achieving 
the optimal scaling requires network channel state information (CSI) when designing the precoders. In 
particular, with only two users, previous work has shown a loss in capacity scaling with signal-to-noise 
ratio (SNR) when channel coefficients are not known at the transmitters [3], [4]. Other work has studied
IA with statistical channel state information [5] or for other channel models [6]. Iterative algorithms have 
been proposed that can run in a distributed fashion requiring only local channel state information at each 
node (7J _ l|2j- Such algorithms trade feedback overhead for the overhead of iterating over the wireless 
medium. Previous work has shown that the number of total feedback bits for interference alignment 
scales as the square of the number of users in the network [10]. This is because the total number of
wireless links grows with the square of the number of users in the network. CSI at the transmitter can 
also be obtained through reciprocity, which requires calibration [11]. Such a procedure trades feedback 
overhead for calibration and extra training overhead. 

Beyond the requirement of CSI when designing precoders, there is no prior work analyzing the 
interference channel without channel state knowledge at the receivers. All current methods for maximizing 
degrees of freedom (DOF, the pre-log factor in the sum capacity term related to the total number of spatial 
streams in the network) for the interference channel require channel training and estimation at each node 
even if no feedback mechanism is employed. The requirement of CSI, even if only at the receivers, may 
still dominate communication in an interference channel with many users, since training is known to
effectively reduce the degrees of freedom of a point-to-point link [12]. With low-to-moderate coherence
times, the training required to estimate the $K^2$ wireless channels in a $K$-user MIMO interference channel
can last nearly as long as the coherence time, leaving a very short amount of time for IA transmission
before the CSI becomes stale. Time and frequency synchronization among all nodes is also required for
interference alignment, adding to the overhead burden of the network.

To mitigate the domination of scaling overhead in large interference channels, prior work has considered
clustering in a cellular network based on spatial proximity [13], [14], but this clustering is done without
optimization and does not consider overhead. Others have considered the impact of imperfect CSI on
the achievable sum rate of interference alignment [15], but considered only the case where all links have
the same channel estimation error. The number of bits of limited feedback desired for single-antenna
interference alignment was investigated in [10]. Overhead due to training was neglected in both cases.

Interference alignment-type transmitters with no transmit CSI have recently been proposed Q, jS), 
resulting in reduced network degrees of freedom. Such work makes the assumption that the network 
is operating in an environment where training and feedback overhead will dominate, and the total IA 
throughput will be smaller than a suboptimal strategy with no feedback. This assumption is valid in quickly 
varying channels. The overhead conditions are not quantified. This paper makes no such assumption and 
instead addresses the question, "how much overhead makes IA infeasible?" The question has not been 
addressed in the literature, and its answer is unclear. With very static channels, we can dedicate long
training sequences to generate high-fidelity channel estimates that will be accurate for a long period
of time. Conversely, with quickly varying channels, obtaining a large amount of channel information is
infeasible. For all the cases between these two extremes, the overhead must be quantified.

In this paper we account for overhead in MIMO interference channels through an overhead penalty 
factor on the sum throughput. The model assumes synchronized narrowband block fading with overhead 
requiring access to the wireless medium at the beginning of each frame. Using this model we show that 
the achieved sum rate with overhead of interference alignment will go to zero with a large number of 
users, even if the only overhead in the network is due to training. That is, even with a minimal amount 
of overhead (minimum training lengths, no feedback, no synchronization overhead, no medium access 
control overhead, etc.), IA does not have asymptotically increasing sum rate as the number of users 
grows large. We then show that, if the overhead grows faster than linearly with the number of users 
in the network, partitioning the network into orthogonally transmitting groups can increase the effective 
degrees of freedom. The rest of the paper is devoted to developing smart partitioning methods. 

First, we consider a connected interference channel, where spatial clustering is ineffective because of the 
proximity of all users. We derive an optimization to maximize the sum rate when each group is allocated 
an equal amount of transmission time, and the solution to this optimization is shown to be too complex 
to serve its purpose, requiring global channel state information and comprehensive search. We therefore 
propose a greedy algorithm that requires only large scale information (i.e., channel magnitude) on the link 
between each transmit/receive pair, but not for the interfering links. The availability of such information is 
justified because it is likely to be correlated across channel realizations. Based on an approximation to the 
sum rate for interference alignment using linear precoding, the proposed algorithm efficiently partitions 
the network into IA groups. Relative to our previous work [16], this paper introduces new partitioning
algorithms, proposes geographical and equal-rate grouping, and includes analysis on training length. 

Second, we derive an equal-rate unequal-time allocation between groups to enforce sum-rate-fairness 
rather than time-slot-length fairness. This algorithm is shown to require a small modification to the 
equal-time allocation algorithm and an additional final step solving a linear system of equations. This 
solution is again based on a connected interference channel where spatial clustering is not beneficial. In 
an unconnected network, grouping together users that are geographically separated may allow them to 
transmit nearly orthogonal in space with higher throughput due to significant path loss from interfering 
transmitters. Conversely, a network can be partitioned into groups that are nearly mutually orthogonal in 
space, such that the groups can transmit simultaneously (rather than the users transmitting simultaneously 
while groups transmit orthogonally). Finally, we derive greedy algorithms for both of these scenarios based 
on position information obtained through GPS or similar positioning methods. The spatial clustering
algorithms are well-suited for dense ad hoc networks [15], [17], [18], where a natural spatial clustering
may not be present or is distorted because of overhead. Assuming the existence of an IA-enabling
mechanism built into the network, these algorithms require no additional overhead.

In summary, this paper proposes a suite of transmission strategies, and a method for choosing among 
them, that trade increased overhead for increased capacity, or decreased capacity for decreased overhead. 
The strategies presented here are parameterized by a single scalar parameter, the number of groups with 
which to partition the network. The most complex strategy considered is interference alignment through 
the entire network; the simplest strategy considered is time division multiple access (TDMA) across the 
entire network. By partitioning the network into groups that transmit mutually orthogonally, but using 
IA inside the groups, the gap between IA and TDMA is filled using very little network knowledge 
and processing. Previous work on grouping, for instance for network MIMO [19] and interference
alignment [13], was performed with the overall goal of trading overhead and rate without explicitly
taking overhead into account. Previous efforts to reduce the overhead of IA transmissions, including [7],
[10], assume that all the users are using IA simultaneously, which this paper shows is often suboptimal.
Finally, previous work on imperfect channel state information in interference channels finds rate bounds
but does not optimize these rates as a function of the training length, as this paper does.



This paper is organized as follows: Section II presents the model utilized in this paper; Section III
discusses the problem of partitioning in general and shows why optimal partitioning is impractical;
Section IV proposes greedy algorithms for partitioning the network based on equal time allocation, equal
sum rate allocation, and geographic nearness; Section V analyzes the relationship between partitioning
and training overhead; Section VI presents computational simulations; and Section VII concludes the
paper and points toward future work.

Finally, a word on notation. The log refers to $\log_2$. Bold uppercase letters, such as $\mathbf{A}$, denote matrices,
bold lowercase letters, such as $\mathbf{a}$, denote column vectors, and normal letters $a$ denote scalars. The letter
$\mathbb{E}$ denotes expectation, $\mathbb{C}$ is the complex field, $\max\{a, b\}$ denotes the maximum of $a$ and $b$, $\|\mathbf{A}\|_F$ is the
Frobenius norm of matrix $\mathbf{A}$, and $|\mathbf{A}|$ is the determinant of square matrix $\mathbf{A}$. The empty set is denoted
as $\emptyset$, the identity matrix of appropriate dimension is $\mathbf{I}$, and $\mathbf{I}_{A \times B}$ is the $A \times B$ truncated identity matrix.

II. System Model 

We consider a distributed MIMO network with $2K$ nodes. $K$ of the nodes have data to transmit
via their $N_t$ antennas to the other $K$ nodes, each with $N_r$ antennas, with no multicasting or cooperative
transmission. Transmitter $k \in \{1, \ldots, K\}$ has data destined only for receiver $k$. We assume a narrowband
block fading model where the $N_r \times N_t$ matrix channel $\mathbf{H}_{k,\ell}$ between transmitter $\ell \in \{1, \ldots, K\}$
and receiver $k \in \{1, \ldots, K\}$ is independently generated every $T$ symbol periods $\forall k, \ell$. We assume
transmissions are frame and frequency synchronous. Thus, at any fixed moment in time, there is a
$K$-user MIMO interference channel with $N_t$ antennas at each transmitter and $N_r$ antennas at each receiver,
as illustrated in Figure 1. We consider scenarios that are theoretically amenable to interference
alignment; that is, we consider channels in which, without overhead, IA would be a good
candidate transmission strategy, with strong channels between all nodes. In scenarios where IA is not
desirable, such as when interference is much stronger than the signal, receiver methods such as successive
interference cancellation (SIC) may be more attractive [20]. The assumption that all nodes have identical
coherence times is justified because of previous work showing that multiuser transmission is severely
degraded in quickly changing channels [21], [22], meaning all candidates for interference alignment are
likely to have relatively static channels. Analysis for different coherence times for each link is left for
future work.

Communication is divided into frames of period $T$ symbols, as shown in Figure 2. The beginning of
each frame is devoted to overhead, which may include training, feedback, synchronization, higher layer 

February 9, 2012 DRAFT 



overhead, and so on. We do not make assumptions about the source or amount of overhead. Later we
will explicitly model channel training and estimation, but this will not preclude the existence of other
overhead sources. For channel estimation, the transmitters send mutually orthogonal training sequences
since the network is connected (i.e., spatially dense). This training is necessary not only for coherent
detection but also for CSI feedback required to exploit the full degrees of freedom in the network [3],
[4]. Although reciprocity can be exploited [7], it requires double the training and a special calibration
procedure among all the nodes in the network [11]. Overhead time is $C(K, N_t, N_r) < T$ symbol periods.
Thus overhead requires a fraction $\bar{\alpha} = \min\{C(K, N_t, N_r)/T, 1\}$ of the frame, while data is transmitted
during the remaining $\alpha = 1 - \bar{\alpha}$. This overhead model is an extension of the model in [12] to the MIMO
interference channel.
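
As a minimal sketch of the overhead model above, the split of a frame into overhead and data fractions can be computed as follows. The function names and the quadratic overhead law are assumptions introduced only for illustration; the text deliberately leaves $C(K, N_t, N_r)$ general.

```python
from typing import Tuple


def frame_fractions(overhead_symbols: float, coherence_time: float) -> Tuple[float, float]:
    """Split a frame of `coherence_time` symbols into (overhead, data) fractions."""
    if coherence_time <= 0:
        raise ValueError("coherence_time must be positive")
    # Overhead can never occupy more than the whole frame.
    overhead = min(overhead_symbols / coherence_time, 1.0)
    return overhead, 1.0 - overhead


def quadratic_overhead(num_users: int) -> float:
    """Hypothetical overhead law C(K) = K^2 symbols (an assumption, not from the paper)."""
    return float(num_users ** 2)


if __name__ == "__main__":
    for users in (2, 5, 10, 15):
        overhead, data = frame_fractions(quadratic_overhead(users), coherence_time=100)
        print(f"K={users:2d}: overhead fraction {overhead:.2f}, data fraction {data:.2f}")
```

Under this assumed law the data fraction reaches zero once the overhead fills the frame, which is the regime the partitioning methods below aim to avoid.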

The data transmission portion of the frame begins after the first $C(K, N_t, N_r)$ symbols and ends when
the channel changes, $T$ symbol periods after the start of the frame. Information theoretic results, which
neglect overhead, suggest that all transmitters should send aligned signals simultaneously to achieve the
maximum degrees of freedom in the channel and thus approach its sum capacity with high transmit power
[1], [2]. The overhead portion of the frame has given the transmitters sufficient information to design
linear precoders. While linear precoding may not be sum-rate-optimal [2], it is a practical approach for
immediate implementation because of the simplicity of the receiver signal processing. Transmitter $\ell$
sends $S_\ell$ spatial streams to receiver $\ell$. At symbol period $n$, the signal observed by receiver
$k \in \{1, \ldots, K\}$ is

$$\mathbf{y}_k[n] = \sqrt{\rho_{k,k}}\, \mathbf{H}_{k,k} \mathbf{F}_k \mathbf{s}_k[n] + \sum_{\ell = 1,\, \ell \neq k}^{K} \sqrt{\rho_{k,\ell}}\, \mathbf{H}_{k,\ell} \mathbf{F}_\ell \mathbf{s}_\ell[n] + \mathbf{v}_k[n], \tag{1}$$

where $\rho_{k,\ell} = E_\ell \gamma_{k,\ell}$, $E_\ell$ is the transmit power of transmitter $\ell$, $\gamma_{k,\ell}$ is the fading coefficient from
transmitter $\ell$ to receiver $k$, $\mathbf{H}_{k,\ell}$ is the $N_r \times N_t$ MIMO channel from transmitter $\ell$ to receiver $k$, $\mathbf{F}_\ell$
is the $N_t \times S_\ell$ unit-norm linear precoder used at transmitter $\ell$, $\mathbf{s}_\ell$ is the $S_\ell \times 1$ vector of symbols sent
by transmitter $\ell$, and $\mathbf{v}_k$ is zero-mean white circularly symmetric complex Gaussian noise
with covariance matrix $\mathbb{E}\, \mathbf{v}_k \mathbf{v}_k^* = \mathbf{R}_k$. The rest of the paper examines the implications of overhead as a
function of the number of users and proposes methods to find a balance between overhead and capacity
gains.

III. Optimal Partitioning to Reduce Overhead

This section introduces and motivates the notion of network partitioning to reduce overhead. We first 
consider the case of maximizing the sum rate of a network with perfect channel estimation. The model 
described in Section II is a $K$-user MIMO interference channel during the data portion of the frame.
Assuming the training performed in the first part of the frame results in perfect CSI at both transmitter
and receiver, with the overhead model described in Section II and maximum likelihood reception, the
sum rate of the network in bits per transmission for a particular frame is then



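$$R_{\mathrm{sum}} = \alpha \sum_{k=1}^{K} \log \left| \mathbf{I} + \Big( \mathbf{R}_k + \sum_{\ell \neq k} \rho_{k,\ell} \mathbf{H}_{k,\ell} \mathbf{F}_\ell \mathbf{F}_\ell^* \mathbf{H}_{k,\ell}^* \Big)^{-1} \rho_{k,k} \mathbf{H}_{k,k} \mathbf{F}_k \mathbf{F}_k^* \mathbf{H}_{k,k}^* \right|. \tag{2}$$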



When all the transmitters are communicating during the data portion of the frame, the effective 
throughput is reduced by a factor of $\alpha$ relative to the information-theoretic sum rate. The reduction
factor $\alpha$ is a function of the number of symbols required for overhead and the coherence time of
the channel. Overhead includes symbols required for training, feedback, synchronization, or any other 
spectrum utilization not used for communication of data. It is thus a function of the number of users in 
the channel and the number of antennas at each node. 

Our claim is that, if the overhead in the network scales faster than linearly with the number of users in
the network, then the sum rate of the network may be increased through partitioning. Figure 3 illustrates
the concept of partitioning. Instead of all the transmitters sending simultaneously throughout the data
portion of the frame, the frame is divided into $P$ sub-frames, each with an overhead and data portion. If
overhead does scale faster than linearly with $K$, then splitting the interference channel into $P$ equally-sized
interference channels utilizing the spectrum equally but orthogonally will reduce overhead. That is,
if $P > 1$,

$$P \left( \frac{T/P - C(K/P, N_t, N_r)}{T} \right) = \frac{T - P\, C(K/P, N_t, N_r)}{T} > \frac{T - C(K, N_t, N_r)}{T}. \tag{3}$$



Previous work has shown that feedback overhead for IA scales with the square of the number of users [10]. 
A measurement study of a network not even performing coordinated transmissions found that overhead 
scaled faster than linearly with the number of users [23]. Orthogonalization thus has significant potential 
to improve the effective sum rate by reducing total network overhead.
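
The overhead argument can be checked numerically. The sketch below assumes equal-size groups and an overhead law supplied by the caller (both the linear and the quadratic law used here are hypothetical). It reports only the fraction of the frame left for data, not the sum rate, since smaller groups also carry fewer degrees of freedom each.

```python
from typing import Callable


def partitioned_data_fraction(K: int, P: int, T: float,
                              overhead: Callable[[int], float]) -> float:
    """Total fraction of the frame carrying data when K users form P equal groups."""
    if P < 1 or K % P != 0:
        raise ValueError("P must be a positive divisor of K")
    # Each group gets T/P symbols and pays the overhead of a (K/P)-user channel.
    per_group = max(T / P - overhead(K // P), 0.0) / T
    return P * per_group


if __name__ == "__main__":
    K, T = 12, 200.0
    laws = (("linear", lambda k: float(k)), ("quadratic", lambda k: float(k * k)))
    for name, law in laws:
        fractions = {P: round(partitioned_data_fraction(K, P, T, law), 3)
                     for P in (1, 2, 3, 4, 6, 12)}
        print(name, fractions)
```

With the assumed linear law the data fraction does not depend on $P$, while with the assumed quadratic law it grows with $P$, in line with the claim that partitioning helps only when overhead scales faster than linearly.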

Since the capacity of IA is known to increase with the number of users $K$ [1], [2], [24], forcing all users
to transmit orthogonally (time division multiple access, TDMA) is not optimal in general, though in some
cases it may be. We therefore propose a suite of transmission strategies, parameterized by the number
of orthogonal groups $P$, spanning complexity and overhead from interference alignment to TDMA, as
illustrated in Figure 4. That is, for $P = 1$, all the users are transmitting simultaneously using IA, and
with $P = K$, the users are transmitting orthogonally in time-division fashion. For $1 < P < K$, the
network is using a hybrid of the two techniques.

Note that since the original K users were modeled as a connected interference channel, where all 
receivers observe a signal from all transmitters above the noise floor, any subset of transmit/receive pairs, 
in isolation, may also be modeled as a connected interference channel. The interference channel can be 
modeled as a connected graph [25]. A vertex $v_k$ would include both the transmitter and receiver for user
pair k. The cost of each edge could be the signal-to-noise ratio from the transmitter in one vertex to the 
receiver in the other vertex. In this model, the edge cost is assumed to be reciprocal, though this does 
not imply that the channel is reciprocal. The weight associated with each vertex is the signal-to-noise 
ratio from the transmitter in one vertex to the receiver in the same vertex. 



Graph partitioning is an important, well-studied problem in combinatorial optimization [26]. Standard 
graph partitioning methods, however, are not directly applicable to the problem considered in this paper. 
The main reason is that overhead is difficult to incorporate into the graph model. That is, the sum weight 
of a group will depend on how many vertices are assigned to the group, which is not reflected in the 
static weight/cost model. In a non-connected interference channel, where some receivers do not observe
interference from some transmitters, graph partitioning can be directly applied to produce non-orthogonal
groups that attempt to transmit IA at the same time. This is described in more detail in Section IV-C.
We thus develop novel methods for the partitioning desired in our network model.

If users in the interference channel are partitioned into $P$ index sets $\{\mathcal{K}_p\}$, with $K_p = |\mathcal{K}_p|$ users in
the $p$th group, then the sum rate of the network becomes

$$R_{\mathrm{sum}} = \sum_{p=1}^{P} \alpha_p \sum_{k \in \mathcal{K}_p} \log \left| \mathbf{I} + \Big( \mathbf{R}_k + \sum_{\ell \in \mathcal{K}_p,\, \ell \neq k} \rho_{k,\ell} \mathbf{H}_{k,\ell} \mathbf{F}_\ell \mathbf{F}_\ell^* \mathbf{H}_{k,\ell}^* \Big)^{-1} \rho_{k,k} \mathbf{H}_{k,k} \mathbf{F}_k \mathbf{F}_k^* \mathbf{H}_{k,k}^* \right|, \tag{4}$$

where

$$\alpha_p = \frac{T/P - C(K_p, N_t, N_r)}{T}. \tag{5}$$

This extension of (2) sums the rate of each point-to-point MIMO link inside each group ($k \in \mathcal{K}_p$), and
over all groups ($p \in \{1, \ldots, P\}$), where only users in the same group interfere with each other. We then
aim to solve the following optimization:

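$$\begin{aligned} &\text{maximize} && R_{\mathrm{sum}} \\ &\text{with respect to} && P \in \mathbb{N}_1,\; K_p \in \mathbb{N}_1\ \forall p,\; \mathbf{F}_\ell \in \mathbb{C}^{N_t \times S_\ell}\ \forall \ell \\ &\text{subject to} && \sum_{p=1}^{P} K_p = K,\; \|\mathbf{F}_\ell\| \le 1. \end{aligned} \tag{6}$$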

The solution to this optimization is computationally complex and involves not only a brute force search
over every possible grouping, but also the calculation of the desired precoders for each grouping.
Neglecting the precoder calculations, and assuming that we have a priori knowledge that the optimal
partition is to distribute users equally across groups,¹ the number of searches required is still [26]

$$\frac{1}{P!} \prod_{i=0}^{P-1} \binom{K - iK/P}{K/P}. \tag{7}$$

Further, such an optimization requires each link $\mathbf{H}_{k,\ell}$ to be trained and estimated, negating the overhead
reduction that partitioning provides. Obviously this is not a practical way to optimize overhead in 
interference networks. In the next section we present a greedy method for performing channel partitioning 
with only channel quality information. 
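
To give a sense of how quickly the exhaustive search grows, the following sketch counts the ways of splitting $K$ users into $P$ equal-size groups. It assumes the groups are interchangeable (unordered), which is a modelling choice made here for illustration; the function name is likewise hypothetical.

```python
from math import comb, factorial


def equal_partition_count(K: int, P: int) -> int:
    """Number of ways to split K users into P unordered groups of K/P users each."""
    if P < 1 or K % P != 0:
        raise ValueError("P must be a positive divisor of K")
    size = K // P
    count, remaining = 1, K
    for _ in range(P):
        # Choose the members of the next group from the users not yet placed.
        count *= comb(remaining, size)
        remaining -= size
    # Groups are treated as interchangeable, so divide out their P! orderings.
    return count // factorial(P)


if __name__ == "__main__":
    for K, P in ((4, 2), (12, 3), (12, 4), (24, 4)):
        print(f"K={K}, P={P}: {equal_partition_count(K, P)} candidate partitions")
```

Even for a dozen users the count is in the thousands, and each candidate would additionally require precoder design with full CSI.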

IV. Greedy Partitioning 



The sum-rate-optimal partition was shown at the end of Section III to be too complex for implementation.
We thus turn to heuristic approaches to reduce not only computational complexity but also
the amount of network knowledge required for implementation. We first develop a greedy method of
partitioning the network where each group is allocated the same amount of time for transmission. We
then develop a method for allocating time in an unbalanced fashion to make each group's sum rate equal.
Lastly we consider geographic partitioning methods that can exploit an unconnected interference channel.

For the following algorithms we assume a network mechanism exists to allow IA transmissions
simultaneously from all transmitters if needed. Such a mechanism can be a central controller or a
distributed protocol. The partitioning can be piggy-backed onto this mechanism, as illustrated in Figure 5,
with no additional communications overhead, either through a wired backbone or the wireless medium.

¹This assumption is a good approximation in most cases, but is not optimal in every case. Not making this
assumption increases the search complexity even further.

A. Balanced Time Allocation 

To develop a greedy algorithm for partitioning the network, we must first define a selection function 
that assigns a value of placing a user in a group. This function would ideally be the sum rate increase 
of placing a user in a group. This is difficult in multiuser networks since the actual sum rate increase 
will depend on which future users are assigned to the group — knowledge that is unavailable in a greedy 
algorithm, which makes the locally optimum choice at each step without global knowledge. Instead we 
resort to an approximation of this sum rate increase. 

After partitioning the $K$-user interference channel into $P$ orthogonal groups, group $p$ will be a $K_p$-user
interference channel that is restricted to utilizing only $1/P$ of the spectrum or coherence interval. This
enforces a time-sharing fairness constraint while attempting to maximize sum throughput for the entire
frame. An equal-rate-per-group design, which involves unbalanced time allocations, will be investigated
in Section IV-B. Thus, interference alignment is a reasonable choice for precoder design in each group.
Although interference alignment requires extensive CSI and calculation of precoders to find the exact sum
rate, we note that the precoder solutions are independent from the direct links $\{\mathbf{H}_{k,k}\}, \forall k$. Thus, with
interference alignment, the expected throughput will be approximately the rate obtained from randomly
generating orthogonal precoders $\mathbf{Q}$ and combiners $\boldsymbol{\Phi}$ of correct rank drawn uniformly from the Grassmann
manifold in the absence of interferers because of our lack of knowledge of the channel state affecting
the precoders and combiners. We then approximate the expected rate for user $k$ in group $p$ to be



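$$R_{k,p} \approx \alpha_{k,p}\, \mathbb{E}_{\boldsymbol{\Phi},\mathbf{Q}} \log \left| \mathbf{I} + \frac{\rho_{k,k}}{S_k} \boldsymbol{\Phi}^* \mathbf{H}_{k,k} \mathbf{Q} \mathbf{Q}^* \mathbf{H}_{k,k}^* \boldsymbol{\Phi} \right|, \tag{8}$$

where the scaling factor $\rho_{k,k}/S_k$ is for power normalization and $\alpha_{k,p} = (T/P - C(K_p + 1, N_t, N_r))/T$.
The expectation in (8) is an approximation because we draw $\mathbf{Q}$ and $\boldsymbol{\Phi}$ independently, whereas actual IA
precoders and combiners are not mutually independent. We then let $\mathbf{Q} = \mathbf{Q}_U \mathbf{I}_{N_t \times S_k}$ and $\boldsymbol{\Phi} = \boldsymbol{\Phi}_U \mathbf{I}_{N_r \times S_k}$,
where $\mathbf{Q}_U$ and $\boldsymbol{\Phi}_U$ are random unitary matrices of appropriate dimension and $\mathbf{I}_{A \times B}$ is the $A \times B$ truncated
identity matrix. Defining $\tilde{\mathbf{H}}_k = \boldsymbol{\Phi}_U^* \mathbf{H}_{k,k} \mathbf{Q}_U$, then

$$R_{k,p} \approx \alpha_{k,p}\, \mathbb{E}_{\boldsymbol{\Phi}_U,\mathbf{Q}_U} \log \left| \mathbf{I} + \frac{\rho_{k,k}}{S_k} \mathbf{I}_{N_r \times S_k}^* \tilde{\mathbf{H}}_k \mathbf{I}_{N_t \times S_k} \mathbf{I}_{N_t \times S_k}^* \tilde{\mathbf{H}}_k^* \mathbf{I}_{N_r \times S_k} \right|. \tag{9}$$

Then, defining the matrix $\bar{\mathbf{H}}_k = \mathbf{I}_{N_r \times S_k}^* \tilde{\mathbf{H}}_k \mathbf{I}_{N_t \times S_k}$, (9) becomes

$$R_{k,p} \approx \alpha_{k,p}\, \mathbb{E}_{\boldsymbol{\Phi}_U,\mathbf{Q}_U} \log \left| \mathbf{I} + \frac{\rho_{k,k}}{S_k} \bar{\mathbf{H}}_k \bar{\mathbf{H}}_k^* \right| = \alpha_{k,p}\, \mathbb{E}_{\boldsymbol{\Phi}_U,\mathbf{Q}_U} \log \left| \mathbf{I} + \frac{\rho_{k,k}}{S_k} \boldsymbol{\Sigma}_k^2 \right| = \alpha_{k,p}\, \mathbb{E}_{\boldsymbol{\Phi}_U,\mathbf{Q}_U} \sum_{i=1}^{S_k} \log \left( 1 + \frac{\rho_{k,k}}{S_k} \sigma_i^2 \right), \tag{10}$$

where $\boldsymbol{\Sigma}_k$ is the $S_k \times S_k$ diagonal matrix of singular values $\sigma_1, \ldots, \sigma_{S_k}$ of $\bar{\mathbf{H}}_k$. Precise calculation of (10) is not
trivial, so we resort to the bound $\sum_{i=1}^{S_k} \log(1 + \sigma_i^2) \le S_k \log\big(1 + (1/S_k) \sum_{i=1}^{S_k} \sigma_i^2\big)$. This bound is tight
when the singular values are roughly equal. Then, (10) can be rewritten as

$$R_{k,p} \approx \alpha_{k,p} S_k\, \mathbb{E}_{\boldsymbol{\Phi}_U,\mathbf{Q}_U} \log \left( 1 + \frac{\rho_{k,k}}{S_k^2} \|\bar{\mathbf{H}}_k\|_F^2 \right). \tag{11}$$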

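Again with no knowledge of the channels $\{\mathbf{H}_{k,\ell}\}, k \neq \ell$ on which IA precoder design is based, we resort
to computing the expectation

$$\mathbb{E}_{\boldsymbol{\Phi}_U,\mathbf{Q}_U} \left[ \frac{\rho_{k,k}}{S_k^2} \|\bar{\mathbf{H}}_k\|_F^2 \right] = \frac{\rho_{k,k}}{S_k^2} \cdot \frac{S_k^2}{N_t N_r} \|\mathbf{H}_{k,k}\|_F^2 = \frac{\rho_{k,k}}{N_t N_r} \|\mathbf{H}_{k,k}\|_F^2. \tag{12}$$

Using Jensen's inequality [27], we substitute the right side of (12) into (11) and finally have

$$R_{k,p} \approx \alpha_{k,p} S_k \log \left( 1 + \frac{\rho_{k,k}}{N_t N_r} \|\mathbf{H}_{k,k}\|_F^2 \right). \tag{13}$$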

This approximation is justified via the plot in Figure 6 for a 3-user 4-antenna system transmitting 2
streams per user. Despite the seemingly large number of approximations made in the derivation, the 
estimate is surprisingly tight, especially at moderate-to-high SNR. 
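
The closed-form approximation reached at the end of the derivation is cheap to evaluate, since it needs only the direct-link quality and the antenna counts. A minimal sketch follows; the function and argument names are illustrative assumptions, and `alpha` is expected to be the data fraction the user would see in the candidate group.

```python
import math


def approx_user_rate(alpha: float, streams: int, snr: float,
                     direct_channel_norm_sq: float, n_t: int, n_r: int) -> float:
    """Evaluate alpha * S_k * log2(1 + rho_kk * ||H_kk||_F^2 / (N_t * N_r))."""
    if n_t <= 0 or n_r <= 0:
        raise ValueError("antenna counts must be positive")
    effective_snr = snr * direct_channel_norm_sq / (n_t * n_r)
    return alpha * streams * math.log2(1.0 + effective_snr)


if __name__ == "__main__":
    # Illustrative numbers only: 4x4 link, 2 streams, half of the frame left for data.
    print(approx_user_rate(alpha=0.5, streams=2, snr=15.0,
                           direct_channel_norm_sq=16.0, n_t=4, n_r=4))
```

No interfering-link coefficients appear in the expression, which is what lets the partitioning run before any training has taken place.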



The estimation of (13) requires $K_p$, $N_t$, $N_r$, $d(K_p, N_t, N_r)$ (since $\sum_{k \in \mathcal{K}_p} S_k \le d(K_p, N_t, N_r)$), and the
product $\rho_{k,k} \|\mathbf{H}_{k,k}\|_F^2$. The number of antennas $N_t$ and $N_r$ is assumed known a priori,
and the degrees of freedom depend on the transmission strategies available [2], [24], which are also
known in advance. The channel quality metric $\rho_{k,k} \|\mathbf{H}_{k,k}\|_F^2$ can be estimated from the previous channel
realization since large scale fading, including path loss and shadowing, is likely to be correlated across
channel realizations. If $\rho_{k,k} \|\mathbf{H}_{k,k}\|_F^2$ is not known exactly, we can substitute $\mathbb{E}\, \rho_{k,k} \|\mathbf{H}_{k,k}\|_F^2$ in its place,
given previous channel measurements. At the beginning of the algorithm, however, $K_p$, $p \in \{1, \ldots, P\}$
is undefined because the number of groups $P$ is unknown. One could perform the greedy algorithm for
each possible $P$ and choose the one with the highest sum rate, but this would increase the computational
complexity of the algorithm by a factor of $K$. We can instead intelligently choose $P$ based solely on
a priori knowledge of $N_t$, $N_r$, $T$, $C(K, N_t, N_r)$, and $d(K, N_t, N_r)$. In particular, we define degrees of
freedom with overhead $d_K(k, N_t, N_r, T)$ as

$$d_K(k, N_t, N_r, T) = \max\left\{ \frac{T - C(k, N_t, N_r)}{T},\, 0 \right\} d(k, N_t, N_r). \tag{14}$$

We then choose

$$K_0 = \arg\max_{k \in \mathbb{N}_1} d_K(k, N_t, N_r, T) \tag{15}$$

and set $P = K/K_0$, rounded to an integer. This choice of $P$ will be near a good overhead-capacity tradeoff since $K_0$ is
the DOF-optimal number of users in an $N_t \times N_r$ interference channel with overhead $C(k, N_t, N_r)$ and
coherence time $T$.

Once $P$ is found, we can assign users to each group by their approximate rate $R_{k,p}$. The algorithm
is summarized in Table I. The algorithm in Table I requires $P \sum_{i=0}^{K-1} (K - i)$ searches, which grows
approximately with $K^3$ assuming $P$ grows linearly with $K$ ($P$ will not grow faster than linearly with $K$,
so this is a worst-case analysis). Further, relative to the optimal search, this algorithm does not require
computation of precoders (which may be an iterative procedure for $K > 3$), and does not require any of
the channel coefficients to be trained and estimated. Note that this algorithm is based on a model with
linear precoding, which does not result in a linear relationship between $K$ and $d(K, N_t, N_r)$ [24], [28].
This algorithm can work for non-linear precoding [2], which may increase the degrees of freedom in
a constant-coefficient interference channel, with an appropriate approximation of $R_{k,p}$. That problem is
beyond the scope of this paper.
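
The a priori choice of the number of groups described above can be sketched as a one-dimensional search. The code assumes that the degrees of freedom with overhead are the data fraction times $d(k, N_t, N_r)$, that the search is limited to $k \le K$, and that $K/K_0$ is rounded up; the overhead law and the degrees-of-freedom law in the example are hypothetical placeholders, not values from the paper.

```python
import math
from typing import Callable, Tuple


def dof_with_overhead(k: int, T: float, overhead: Callable[[int], float],
                      dof: Callable[[int], float]) -> float:
    """Degrees of freedom of a k-user channel scaled by the data fraction of the frame."""
    return max(T - overhead(k), 0.0) / T * dof(k)


def choose_num_groups(K: int, T: float, overhead: Callable[[int], float],
                      dof: Callable[[int], float]) -> Tuple[int, int]:
    """Return (K0, P): the best group size and the resulting number of groups."""
    k0 = max(range(1, K + 1), key=lambda k: dof_with_overhead(k, T, overhead, dof))
    # Rounding up keeps every group at or below K0 users (an assumption of this sketch).
    return k0, math.ceil(K / k0)


if __name__ == "__main__":
    # Hypothetical laws: C(k) = k^2 symbols of overhead and d(k) = k degrees of freedom.
    print(choose_num_groups(K=20, T=100.0, overhead=lambda k: float(k * k),
                            dof=lambda k: float(k)))
```

Because the search uses only $T$ and the two laws, it can run before any channel is trained.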

Finally, because this is a greedy algorithm, the addition of a user to the network is straightforward 
and efficient. One need only run the algorithm for the new user, with a search complexity of P. After 
several users have joined the network, it will need to be restructured (likely with higher P), but for an 
incremental change, network topology does not need to change. When a user leaves the network, the
network can be maintained by re-allocating the user with the worst performance in the network. This
keeps the groups balanced without having to restructure at every change. Detailed exploration of this
matter is left for other work [29].
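
Table I is not reproduced in this section, so the following is only one plausible reading of the balanced-time greedy assignment, written as a hedged sketch. It assumes every user sends the same number of streams, scores a candidate placement by the approximate rate of the added user alone (ignoring the rate lost by users already in the group), and takes the overhead law as a caller-supplied function. All names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def greedy_partition(link_quality: Sequence[float], P: int, T: float,
                     overhead: Callable[[int], float], streams: int,
                     n_t: int, n_r: int) -> List[List[int]]:
    """Assign users to P groups; link_quality[k] stands for rho_kk * ||H_kk||_F^2."""
    groups: List[List[int]] = [[] for _ in range(P)]
    unassigned = set(range(len(link_quality)))
    while unassigned:
        best = None  # (rate, user, group) of the best placement found so far
        for k in sorted(unassigned):
            for p in range(P):
                # Data fraction if user k joined group p, which then has K_p + 1 users.
                alpha = max(T / P - overhead(len(groups[p]) + 1), 0.0) / T
                rate = alpha * streams * math.log2(1.0 + link_quality[k] / (n_t * n_r))
                if best is None or rate > best[0]:
                    best = (rate, k, p)
        _, k, p = best
        groups[p].append(k)
        unassigned.remove(k)
    return groups


if __name__ == "__main__":
    # Four users, two groups, hypothetical quadratic overhead.
    print(greedy_partition([240.0, 48.0, 16.0, 112.0], P=2, T=100.0,
                           overhead=lambda n: float(n * n), streams=2, n_t=4, n_r=4))
```

Each pass evaluates every remaining user against every group, so the number of evaluations is $P$ times the number of unassigned users per pass, consistent with the search count quoted for Table I. Adding one new user later amounts to a single pass over the $P$ groups.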



B. Sum Rate Fairness 



The algorithm of Section IV-A allotted an equal amount of time in the frame for each group and 
maximized the sum rate under this constraint. Maximizing the sum rate with unbalanced time allocation 
will lead to the group with highest sum rate transmitting for the entire frame. Unbalanced time allocation 
can be used, however, to provide each group with the same sum rate. A disadvantage of such a design 
is that the group with the lowest sum rate is invariably using most of the frame. To mitigate such a 
scenario, we must carefully assign users to groups. 

We first define the estimated sum throughput of group $p$ at any point in the algorithm to be
$R_p = \sum_{k \in \mathcal{K}_p} R_{k,p}$. We then define network disparity for a particular allocation of users as

$$\rho(\{R_p\}) = \max_{p,q}\,(R_p - R_q). \qquad (16)$$

We then modify Step 5 of the algorithm in Table I to be

$$\{k', p'\} = \arg\min_{k,p}\ \rho(\{R_1, \ldots, R_p + R_{k,p}, \ldots, R_P\}). \qquad (17)$$
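The disparity-minimizing version of Step 5 can be illustrated with a short sketch. The data layout is an assumption made for this example: `group_rates` holds the current estimated sum throughput of each group, and `candidate_rates` maps a hypothetical `(user, group)` pair to the estimated rate that user would add to that group.

```python
def disparity(group_rates):
    """Network disparity: largest difference between any two group sum rates."""
    return max(group_rates) - min(group_rates)


def fair_step(group_rates, candidate_rates):
    """Pick the (user, group) pair whose addition minimizes the disparity."""
    best = None
    for (k, p), r in candidate_rates.items():
        trial = list(group_rates)
        trial[p] += r  # tentatively add user k to group p
        d = disparity(trial)
        if best is None or d < best[0]:
            best = (d, k, p)
    return best[1], best[2]


# Two groups with estimated sum rates 3.0 and 1.0; two unassigned users.
current = [3.0, 1.0]
candidates = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 2.0}
choice = fair_step(current, candidates)
```

Here the selected pair places the stronger user in the weaker group, which equalizes the two group rates.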

This modified algorithm will attempt to allocate sum rate equally among all groups. The rates, in general,
will not be equal even after this modification, so group transmission times must be allocated
unequally. This allocation can be based on the estimated sum rates $\{R_p\}$ or, if performed after all the
training, estimation, and feedback for the frame has occurred, on the actual sum rates. For simplicity
we will use $\{R_p\}$. If group $p$ is allocated $\mu_p T$ symbols for transmission (including overhead), then the
sum rate of the network becomes

$$R_{\text{sum}} = \sum_{p=1}^{P} \frac{\mu_p T - C(K_p, N_t, N_r)}{T}\, R_p. \qquad (18)$$

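We constrain $\sum_{p} \mu_p = 1$ and $\mu_p > 0$, $\forall p$. The sum rate of each group is an unknown $R^*$. We can enforce
the equal-rate constraint with a set of equations:

$$\mu_p R_p - R^* = \alpha_p R_p, \quad p \in \{1, \ldots, P\}, \qquad (19)$$

where $\alpha_p = C(K_p, N_t, N_r)/T$ is the overhead fraction of group $p$, which we can then form into a linear relation

$$\begin{bmatrix} R_1 & & & -1 \\ & \ddots & & \vdots \\ & & R_P & -1 \\ 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_P \\ R^* \end{bmatrix} = \begin{bmatrix} \alpha_1 R_1 \\ \vdots \\ \alpha_P R_P \\ 1 \end{bmatrix}. \qquad (20)$$

The time allocation vector $\boldsymbol{\mu} = (\mu_1, \mu_2, \ldots, \mu_P, R^*)$ has a unique solution since the left matrix in (20)
is square and non-singular.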
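Because each equal-rate equation can be solved for $\mu_p$ individually, the system also admits a closed form, sketched below. The sketch assumes the equal-rate equation holds for every group and that `overhead_fracs[p]` is the fraction of the frame consumed by overhead in group $p$; both names are illustrative. Substituting $\mu_p = \alpha_p + R^*/R_p$ into $\sum_p \mu_p = 1$ gives $R^* = (1 - \sum_p \alpha_p) / \sum_p (1/R_p)$.

```python
def equal_rate_time_allocation(rates, overhead_fracs):
    """Return (mu, r_star) so that every group achieves the same sum rate.

    rates[p]          -- estimated sum rate of group p while it transmits
    overhead_fracs[p] -- assumed overhead fraction of the frame for group p
    """
    if any(r <= 0 for r in rates):
        raise ValueError("every group needs a positive rate")
    slack = 1.0 - sum(overhead_fracs)  # frame fraction left for data
    if slack <= 0:
        raise ValueError("overhead consumes the entire frame")
    r_star = slack / sum(1.0 / r for r in rates)
    mu = [a + r_star / r for a, r in zip(overhead_fracs, rates)]
    return mu, r_star


rates = [4.0, 2.0]
alphas = [0.10, 0.05]
mu, r_star = equal_rate_time_allocation(rates, alphas)
```

The group with the lower rate receives the larger share of the frame, which is the disadvantage noted at the start of this subsection.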



C. Geographic Grouping 



Since the greedy Balanced Time Allocation algorithm proposed in Section IV-A estimates its rate
based only on the SNR between user pairs $\rho_{k,k}$, and neglects inter-user SNRs $\rho_{k,\ell}$, $k \neq \ell$, it does not take
advantage of possible natural groupings that may arise from geographical clusters. It has been shown
that IA performs best, relative to other transmission techniques, when all receivers have strong links to
all transmitters. This is because IA is a degrees-of-freedom-optimal transmission strategy, and degrees
of freedom are most important in the regime where all receivers have strong links to all transmitters.
Thus, a position- or signal strength-based algorithm could group geographically close users to maximize
the benefit of IA. Conversely, if non-IA transmissions are considered, a similar algorithm could group
together users that are geographically separated, choosing to transmit as if no interference existed. Since
this regime is not "high SNR" in the interference channel sense (some links may have strong power, but
not all), interference alignment is not the desired transmission strategy, and instead interference can be
ignored. This section analyzes the latter case, which, as we will show, is algorithmically equivalent to
the first case.

We study the problem of geographic grouping under time-orthogonal transmissions, still considering 
the overhead model of previous sections. It is assumed that the central controller executing the partitioning 
algorithm has position information for each transmitter and receiver in the network, although this could 
be replaced with a channel quality indicator for the channels between all receivers and transmitters in the 
network. The position of receiver $k$ is $\delta_k$, while the position of transmitter $\ell$ is $\pi_\ell$, so that the distance
between transmitter $\ell$ and receiver $k$ is $\|\delta_k - \pi_\ell\|$. We then define

$$\Delta_{k,p} = \min_{\ell \in \mathcal{K}_p} \|\delta_k - \pi_\ell\|. \qquad (21)$$

If no user is allocated to the $p$th group, then we define $\Delta_0 > 0$ to be a small default distance and set
$\Delta_{k,p} = \Delta_0$. We can then modify Step 5 of the algorithm in Table I to be

$$\{k', p'\} = \arg\max_{k,p}\ \Delta_{k,p}. \qquad (22)$$

To group the closest users and perform IA, we can simply switch the max and min in (21) and (22).
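The distance-based selection rule can be sketched as a complete greedy loop. This is an illustration under assumptions: positions are 2-D coordinate tuples, ties are broken by iteration order, and no group-size limit is imposed because none is stated in this subsection. The function and argument names are hypothetical.

```python
import math


def geographic_partition(rx_pos, tx_pos, num_groups, default_dist):
    """Greedily group users whose links are far apart.

    rx_pos[k], tx_pos[k] -- coordinates of receiver k and transmitter k
    default_dist         -- the small default distance used for an empty group
    """
    groups = [[] for _ in range(num_groups)]
    unassigned = list(range(len(rx_pos)))

    def delta(k, p):
        # Distance from receiver k to the nearest transmitter already in group p.
        if not groups[p]:
            return default_dist
        return min(math.dist(rx_pos[k], tx_pos[l]) for l in groups[p])

    while unassigned:
        best = None
        for k in unassigned:
            for p in range(num_groups):
                d = delta(k, p)
                if best is None or d > best[0]:
                    best = (d, k, p)
        _, k_best, p_best = best
        groups[p_best].append(k_best)
        unassigned.remove(k_best)
    return groups


# Two clusters about 100 units apart; each receiver sits next to its transmitter.
tx = [(0.0, 0.0), (100.0, 0.0), (1.0, 0.0), (101.0, 0.0)]
rx = [(x, y + 1.0) for x, y in tx]
groups = geographic_partition(rx, tx, num_groups=2, default_dist=10.0)
```

In this toy layout each group ends up with one user from each cluster, so simultaneous transmitters are well separated. Replacing `max` and `min` as described above would instead cluster nearby users for IA.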



V. Optimizing Training Overhead with Partitioning 

To analyze the relationship between channel partitioning and overhead, we consider the physical layer 
overhead of training for channel estimation. While different interference alignment techniques have 
varying requirements for transmit CSI (and thus feedback overhead), they all require receive CSI for 
interference nulling. Receive CSI is typically obtained through transmission of a
known training sequence orthogonal to the data. In this section we find the optimal training lengths for
a given partition, and the effect that partitioning has on training length. 

In general, channel estimation is done in the presence of noise, which means imperfect CSI at both
receiver and transmitter. In this case, Q is no longer achievable. To approach this problem, we first
assume perfect feedback of the imperfect channel estimates $\{\hat{\mathbf{H}}_{k,\ell}\}$. Second, each receiver applies an
interference-cancelling orthogonal filter $\mathbf{U}_k$ to its received signal $\mathbf{y}_k$, such that $\mathbf{z}_k = \mathbf{U}_k^* \mathbf{y}_k$, and $\mathbf{s}_k$ is
estimated using ML detection from $\mathbf{z}_k$. Finally, we assume that the precoder design and receiver designs
treat the channel state knowledge as perfect.

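Let

$$\hat{\mathbf{H}}_{k,\ell} = \mathbf{H}_{k,\ell} - \mathbf{E}_{k,\ell}, \quad \forall k, \ell. \qquad (23)$$

If the receivers use a minimum mean square error (MMSE) estimator, and the channel is being estimated
in additive white Gaussian noise, then $\hat{\mathbf{H}}_{k,\ell}$ and $\mathbf{E}_{k,\ell}$ are uncorrelated. The precoder $\mathbf{F}_k$ is based solely
on the channel estimates. The received signal at receiver $k$ is

$$\mathbf{y}_k = \sqrt{\rho_{k,k}}\, \mathbf{H}_{k,k} \mathbf{F}_k \mathbf{s}_k + \sum_{\ell \neq k} \sqrt{\rho_{k,\ell}}\, \mathbf{H}_{k,\ell} \mathbf{F}_\ell \mathbf{s}_\ell + \mathbf{v}_k, \qquad (24)$$

and the filtered signal, after interference filtering, is

$$\mathbf{z}_k = \sqrt{\rho_{k,k}}\, \mathbf{U}_k^* \hat{\mathbf{H}}_{k,k} \mathbf{F}_k \mathbf{s}_k + \sqrt{\rho_{k,k}}\, \mathbf{U}_k^* \mathbf{E}_{k,k} \mathbf{F}_k \mathbf{s}_k + \mathbf{U}_k^* \sum_{\ell \neq k} \sqrt{\rho_{k,\ell}}\, \mathbf{E}_{k,\ell} \mathbf{F}_\ell \mathbf{s}_\ell + \mathbf{U}_k^* \mathbf{v}_k, \qquad (25)$$

since $\mathbf{U}_k^* \sum_{\ell \neq k} \hat{\mathbf{H}}_{k,\ell} \mathbf{F}_\ell = \mathbf{0}$ through interference alignment. If the error matrix $\mathbf{E}_{k,\ell}$ is drawn from a
circularly symmetric complex Gaussian distribution, where each component is independent with variance
$\sigma_E^2$ $\forall k, \ell$, previous work [15] has found a lower bound on the sum rate using interference alignment with
linear precoders when all the links have equal channel estimation error: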

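$$R_{ee} \geq \frac{T - C(K, N_t, N_r)}{T} \sum_{k=1}^{K} \log \left| \mathbf{I} + \frac{\rho_{k,k}}{1 + \sigma_E^2 \rho_{k,k} + \sum_{\ell \neq k} \sigma_E^2 \rho_{k,\ell}}\, \hat{\mathbf{H}}_{k,k} \hat{\mathbf{H}}_{k,k}^* \right|. \qquad (26)$$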



Because of the homogeneous assumption of channel estimation error, this formulation is useful when each
receiver is roughly equidistant from each transmitter. In general, such an assumption may not be valid.
Further, by characterizing the error variance in terms of the training length, it is possible to design the
length of our training sequences to further trade overhead and rate.

Expanding the analysis to include unequal error variances as a function of training length, we extend
a previous model for point-to-point communications [12] to MIMO interference channels. The residual
interference term in (25) is possibly non-Gaussian and dependent on the data we wish to decode [12].
We therefore find a lower bound on the capacity of this system by examining the worst-case additive
noise that is uncorrelated with the data. This noise model is tractable and has the same energy as the
residual interference term. The analysis from [12] is directly applicable because we are making the same
assumptions as Theorem 1 in that paper. Thus, the worst-case uncorrelated additive noise is spatially
white, zero mean, circularly symmetric, and Gaussian with covariance matrix $\sigma_{n_k}^2 \mathbf{I}$. Refer to the Appendix
of [12] for proof. Define $\bar{\mathbf{E}}_{k,\ell} = \mathbf{U}_k^* \mathbf{E}_{k,\ell} \mathbf{F}_\ell$. Assuming $\mathbb{E}\, \mathbf{s}_k \mathbf{s}_k^* = \mathbf{I}$, $\forall k$, and $\mathbb{E}\, \mathbf{s}_\ell \mathbf{s}_k^* = \mathbf{0}$, $\forall \ell \neq k$, then



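$$\sigma_{n_k}^2 = 1 + \frac{1}{S_k} \sum_{\ell=1}^{K} \rho_{k,\ell}\, \operatorname{tr}\, \mathbb{E}\, \bar{\mathbf{E}}_{k,\ell} \bar{\mathbf{E}}_{k,\ell}^*. \qquad (27)$$

Define $\sigma_{\hat{H}_{k,\ell}}^2 = \frac{1}{N_r N_t} \mathbb{E}\, \operatorname{tr}\, \hat{\mathbf{H}}_{k,\ell} \hat{\mathbf{H}}_{k,\ell}^*$, and by the orthogonality principle, $\sigma_{\hat{H}_{k,\ell}}^2 = 1 - \sigma_{E_{k,\ell}}^2$. Finally, we normalize
each channel such that $\bar{\mathbf{H}}_{k,k} = \hat{\mathbf{H}}_{k,k} / \sigma_{\hat{H}_{k,k}}$. The sum capacity with overhead is bounded from below by

$$R_\tau \geq \mathbb{E} \sum_{k=1}^{K} \frac{T - \tau - \bar{C}(K, N_t, N_r)}{T} \log \left| \mathbf{I} + \frac{\rho_{k,k}\, \sigma_{\hat{H}_{k,k}}^2}{S_k\, \sigma_{n_k}^2}\, \bar{\mathbf{H}}_{k,k} \bar{\mathbf{H}}_{k,k}^* \right|, \qquad (28)$$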



where $\tau$ is the number of transmissions per frame used for training, and $\bar{C}(K, N_t, N_r) = C(K, N_t, N_r) - \tau$
is the number of transmissions per frame required for overhead other than training.
Utilizing orthogonal training sequences from each transmit antenna, we find that 



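$$\sigma_{\hat{H}_{k,\ell}}^2 = \frac{\rho_{k,\ell}\, \tau}{S_\ell + \rho_{k,\ell}\, \tau}, \qquad (29)$$

$$\sigma_{E_{k,\ell}}^2 = \frac{S_\ell}{S_\ell + \rho_{k,\ell}\, \tau}. \qquad (30)$$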



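The sum rate (28) can thus be rewritten as

$$R_\tau \geq \mathbb{E} \sum_{k=1}^{K} \frac{T - \tau - \bar{C}(K, N_t, N_r)}{T} \log \left| \mathbf{I} + \frac{\rho_{\text{eff},k}}{S_k}\, \bar{\mathbf{H}}_{k,k} \bar{\mathbf{H}}_{k,k}^* \right| \qquad (31)$$

$$= \mathbb{E} \sum_{k=1}^{K} \frac{T - \tau - \bar{C}(K, N_t, N_r)}{T} \log \left| \mathbf{I} + \frac{1}{S_k} \cdot \frac{\rho_{k,k}^2\, \tau}{S_k + \rho_{k,k} \tau + \sum_{\ell=1}^{K} \frac{S_k + \rho_{k,k} \tau}{S_\ell + \rho_{k,\ell} \tau}\, \rho_{k,\ell} S_\ell}\, \bar{\mathbf{H}}_{k,k} \bar{\mathbf{H}}_{k,k}^* \right|. \qquad (32)$$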



The training length $\tau$ can then be found by Monte Carlo methods to maximize the lower bound of (32).
At high SNR,

$$\rho_{\text{eff},k} \approx \frac{\rho_{k,k}\, \tau}{\tau + \sum_{\ell=1}^{K} S_\ell}, \qquad (33)$$

and the effect of the residual interferers is constant with respect to $\{\rho_{k,\ell}\}$, meaning that there is no
reduction in the degrees of freedom region compared to the perfect CSI case. If $\{\rho_{k,\ell}\}$ is fixed, however,
and $K$ increases, then sum throughput is reduced. Thus, to maintain a sum rate increase with additional
users, the signal power must increase with the addition of each user. Increasing the training length can
also improve $\rho_{\text{eff},k}$, but is detrimental to the pre-log overhead factor.
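The trade-off between estimation quality and the pre-log factor can be explored numerically with a small sketch. Everything here is an assumption made for illustration: the per-link error variance is taken as $S_\ell / (S_\ell + \rho_{k,\ell}\tau)$, the effective SNR as the estimated signal power over one plus the residual error power from all transmitters, and the expectation over channels is replaced by a deterministic surrogate (identity effective channels) rather than the Monte Carlo average described above. All function names are hypothetical.

```python
import math


def error_variance(rho, tau, streams):
    """Assumed per-entry channel estimation error variance."""
    return streams / (streams + rho * tau)


def effective_snr(k, rho, tau, streams):
    """Assumed effective SNR at receiver k; rho[k][l] is a linear-scale SNR."""
    est_var = 1.0 - error_variance(rho[k][k], tau, streams[k])
    residual = sum(rho[k][l] * error_variance(rho[k][l], tau, streams[l])
                   for l in range(len(streams)))
    return rho[k][k] * est_var / (1.0 + residual)


def high_snr_approx(k, rho, tau, streams):
    """High-SNR limit: rho_kk * tau / (tau + total number of streams)."""
    return rho[k][k] * tau / (tau + sum(streams))


def surrogate_rate(tau, rho, streams, t_coh, other_overhead=0):
    """Pre-log factor times a channel-free stand-in for the log-det term."""
    prelog = max(t_coh - tau - other_overhead, 0) / t_coh
    return prelog * sum(
        s * math.log2(1.0 + effective_snr(k, rho, tau, streams) / s)
        for k, s in enumerate(streams))


def best_training_length(rho, streams, t_coh, other_overhead=0):
    """Grid search over all integer training lengths below the frame length."""
    return max(range(1, t_coh),
               key=lambda tau: surrogate_rate(tau, rho, streams, t_coh, other_overhead))


streams = [1, 1, 1]
rho = [[100.0] * 3 for _ in range(3)]  # 20 dB on every link
tau_star = best_training_length(rho, streams, t_coh=200)
```

Longer training raises the effective SNR but shrinks the pre-log term, so the surrogate has an interior maximum; the value it returns depends entirely on the assumptions above and is not a result from the paper.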

If the network is partitioned into $P$ groups, each utilizing IA, then the rate is bounded from below by

$$R_\tau \geq \mathbb{E} \sum_{p=1}^{P} \frac{T/P - \tau_p - \bar{C}(K_p, N_t, N_r)}{T} \sum_{k \in \mathcal{K}_p} \log \left| \mathbf{I} + \frac{\rho_{\text{eff},k,p}}{S_k}\, \bar{\mathbf{H}}_{k,k} \bar{\mathbf{H}}_{k,k}^* \right|, \qquad (34)$$

where

$$\rho_{\text{eff},k,p} = \frac{\rho_{k,k}^2\, \tau_p}{S_k + \rho_{k,k} \tau_p + \sum_{\ell \in \mathcal{K}_p} \frac{S_k + \rho_{k,k} \tau_p}{S_\ell + \rho_{k,\ell} \tau_p}\, \rho_{k,\ell} S_\ell} \qquad (35)$$

and $\tau_p$ is the number of training symbols used in group $p$. In this case, partitioning has the added potential
to benefit the rate by reducing the total estimation error, since fewer channels must be estimated.
The rate bound of (34) allows an engineer to design the length of the training sequences as functions of
the expected channel conditions. In summary, partitioning an interference channel can not only increase
throughput by reducing overhead, but it can also increase the reliability of channel estimates. Further,
the amount of overhead, $\alpha$, can be optimized through training length minimization.

VI. Simulations 
This section presents numerical results demonstrating the effect of overhead on the interference channel
and comparing the greedy partitioning method of Section IV to previous approaches. The simulations are
done using iterative interference alignment with linear precoding [7], [8] with 100 iterations, although
the analysis does not preclude utilization of other IA designs. As in [9], five random initializations
are used at each iteration, and the precoding design with best sum throughput among the different
initializations is chosen as the design for that iteration. The degrees of freedom using this method have
been conjectured to be $d(K, N_t, N_r) = (N_t + N_r)K/(K + 1)$ [24], and the number of streams is varied
according to this relationship. Thus, if a group has $K_p$ users with $M$ antennas, each transmitter will send
$d(K_p, M)/K_p$ streams. When $d(K_p, M)/K_p$ is not an integer, some transmitters (chosen randomly) will
transmit $\lceil d(K_p, M)/K_p \rceil$ streams and the rest will send $\lfloor d(K_p, M)/K_p \rfloor$ streams such that the sum of
streams in the network is $d(K_p, M)$. Unless noted otherwise, channels are generated with independent
and identically distributed (i.i.d.) zero-mean circularly symmetric complex Gaussian coefficients with unit
variance and $\rho_{k,\ell} = 20$ dB for all $k$ and $\ell$, ensuring that the network is fully connected as discussed in
Section II. At low SNR values, interference alignment has been shown to perform poorly [7]; thus a
moderately high SNR environment is assumed. Generating the channels i.i.d. with a Gaussian distribution
gives an idea of the best possible performance, since correlation has been shown to reduce IA rates [30].
Absolute values for coherence time and overhead are irrelevant, so the overhead percentage of the
coherence time, $\alpha = C(K, N_t, N_r)/T$, or the data percentage of the coherence time assuming $P = 1$
group, $1 - \alpha = (T - C(K, N_t, N_r))/T$, is used. For TDMA, overhead is assumed to scale linearly
with the number of users ($\alpha \propto K$), while for IA, it scales with the square of the number of users [10]
($\alpha \propto K^2$).
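The stream assignment rule described above might be sketched as follows. Two choices here are assumptions rather than statements from the text: the conjectured total is rounded down when it is not an integer, and a single-user group is given $\min(N_t, N_r)$ streams, the point-to-point MIMO limit. The function names are illustrative.

```python
import random


def total_streams(num_users, n_t, n_r):
    """Integer stream budget from d(K, N_t, N_r) = (N_t + N_r) K / (K + 1)."""
    if num_users == 1:
        return min(n_t, n_r)  # assumption: point-to-point limit for one user
    return ((n_t + n_r) * num_users) // (num_users + 1)  # assumption: round down


def allocate_streams(num_users, n_t, n_r, rng=random):
    """Give each user floor(d/K) streams; randomly chosen users get one extra."""
    d = total_streams(num_users, n_t, n_r)
    base, extra = divmod(d, num_users)
    lucky = set(rng.sample(range(num_users), extra))
    return [base + 1 if k in lucky else base for k in range(num_users)]


rng = random.Random(0)
three_user = allocate_streams(3, 2, 2, rng)  # K = 3, two antennas per node
four_user = allocate_streams(4, 4, 4, rng)   # a budget that does not divide evenly
```

For three users with two antennas per node the budget divides evenly into one stream per transmitter, matching the configuration described for Figure 8.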

Figure 7 demonstrates how the optimal number of groups in a partition of a 6-user network varies with
the coherence time of the channel. In this figure, $P = 1$ means IA over the entire network, while $P = 6$
means TDMA over the entire network. Thus, $P$ can be viewed as a complexity parameter that can vary
transmission complexity from IA to TDMA with every combination thereof, as depicted in Figure 4.
With low coherence times (or high overhead percentage since overhead is constant for variable T), IA 
over the entire network results in overhead consuming the entire frame. TDMA gives a non-zero sum 
rate but is still not optimal. Partitioning the network into 4 groups, 2 of which have one user while the 
rest have two users, results in the highest sum rate when overhead is considered. As the coherence time 
increases, however, IA gains start to outweigh the cost of overhead and thus a single-group partition, 
equivalent to not partitioning the network, is the best choice in terms of sum rate. 
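The dependence of the preferred number of groups on coherence time can be illustrated with a degrees-of-freedom calculation. This sketch is not the simulation behind Figure 7: it assumes equal time per group, near-equal group sizes, a hypothetical overhead of `c * K_p**2` symbols for a group of $K_p$ users, the conjectured degrees of freedom $(N_t + N_r)K_p/(K_p + 1)$, and $\min(N_t, N_r)$ for a single-user group. It ignores SNR entirely.

```python
def dof(num_users, n_t, n_r):
    """Conjectured total degrees of freedom of one group."""
    if num_users == 1:
        return float(min(n_t, n_r))  # assumption: point-to-point limit
    return (n_t + n_r) * num_users / (num_users + 1)


def balanced_sizes(num_users, num_groups):
    """Split users into groups whose sizes differ by at most one."""
    base, extra = divmod(num_users, num_groups)
    return [base + 1] * extra + [base] * (num_groups - extra)


def prelog(num_users, num_groups, n_t, n_r, t_coh, c=1.0):
    """Sum over groups of (data fraction of the frame) x (degrees of freedom)."""
    total = 0.0
    for k_p in balanced_sizes(num_users, num_groups):
        data_frac = 1.0 / num_groups - c * k_p ** 2 / t_coh
        total += max(data_frac, 0.0) * dof(k_p, n_t, n_r)
    return total


def best_group_count(num_users, n_t, n_r, t_coh, c=1.0):
    """Number of groups that maximizes the pre-log under the assumed model."""
    return max(range(1, num_users + 1),
               key=lambda p: prelog(num_users, p, n_t, n_r, t_coh, c))


short_frame = best_group_count(6, 3, 4, t_coh=40)
long_frame = best_group_count(6, 3, 4, t_coh=1000)
```

Under these assumptions a short frame favors an intermediate number of groups, while a long frame favors a single group, which is the same qualitative trend as described above.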

Figure 8 shows the sum rate of the greedy partitioning method and the exhaustive partitioning method
for $K = 3$ users for various $\alpha$, with $N_t = N_r = 2$ antennas at each node. For exhaustive partitioning,
all possible values for $P$ are considered and the actual sum rate with global channel knowledge Q
is used. With a small coherence time, TDMA outperforms IA, whereas with a large coherence time,
IA throughput gains outweigh the overhead cost of implementation, resulting in a better sum rate than
TDMA. The partitioning algorithms are able to dynamically vary the network transmission strategy as
the coherence time changes. Further, the greedy partitioning method approximates the optimal partitioning
without a brute force search, with its worst performance at moderately low $\alpha$ due to the a priori choice
of the number of groups based on degrees of freedom with overhead. For $K = 6$ users, partitioning
leads to a larger sum rate increase at moderate SNR versus switching between IA and TDMA, as shown
in Figure 9. This is due to the increased number of possible partitions. Greedy partitioning is again able
to adapt between the possible partitions as overhead is varied. Note that, although optimal search is not
shown in this figure due to computational complexity, we know that since the greedy algorithm performs
the partitioning based on large scale statistics, its throughput curve as a function of $1 - \alpha$ is a piecewise
linear function. The different segments of this function correspond to ranges where a particular partition size is
judged to be favorable when averaged over small scale fading effects. This is visible in Figures 8, 9,
and 10. Thus, the greedy algorithm will be furthest from optimal in the switching regions, such as around
$1 - \alpha \approx 0.5$ in Figure 8. The gap between optimal and greedy will therefore grow with the number of
possible partitions, and thus the number of users.



Figure 10 demonstrates the gains of geographic grouping in a 6-cell network with user locations drawn
uniformly from a circle with radius 758 m around each base station, which are placed 1.52 km apart.
The channel model is the Type E model from IEEE 802.16j [31], and the base stations transmit with
$N_t = 3$ transmit antennas and 40 dBm transmit power. When the partitioning algorithm chooses $P > 1$,
grouping the users based on geographic distance outperforms the IA max-sum-rate algorithm because the
IA gains are smaller in this operating region and are offset by the relatively high overhead of IA versus
ignoring the interference. That is, users can be grouped to operate in a high SIR region, where ignoring
interference is preferable to aligning it. More spatial streams can be exploited this way, utilizing less
overhead because fewer channels must be estimated and fed back. At large coherence times IA is still the
preferred strategy because the transmitters can utilize the entire frame after overhead for transmission.
Finally, Figure 11 demonstrates the lower bound on the sum capacity from Section V as a function
of the training length $\tau$ for $M = 10$ antennas, $K = 4, 9, 19$ users, $\rho = 0, 10, 20$ dB on all links, and
coherence time $T = 200$ symbols. In this case, the optimal $\tau$ does not significantly vary for different $K$,
but increases from 18 to 42 symbols as $\rho$ decreases from 20 dB to 0 dB.

VII. Conclusions 

This paper demonstrated the limitations of cooperative protocols for interference channels through 
overhead that scales faster than linearly with the number of users in the network. In particular, as the 
network grows, the sum rate with overhead of interference alignment goes to zero. By considering 
network overhead in the practical design for the interference channel, this paper has found analytical and 
algorithmic methods for trading off the overhead with the sum rate increase of cooperative transmission 
strategies by partitioning the network into orthogonally transmitting groups. A suite of transmission
designs spanning the simplicity of TDMA to the performance of IA can be chosen using the simple 
algorithms derived in this paper. The proposed algorithms attempt to maximize the sum rate with overhead 
with fair time sharing of the channel, fair sum rate between groups, or geographic grouping to exploit the 
reduced interference levels in unconnected channels. More work is required to characterize and reduce 
the overhead required for such strategies, particularly for obtaining CSI at the transmitters. 
Fig. 1. The MIMO interference channel. Each transmitter is paired with a single receiver. In the model considered in this
paper, the channels $\mathbf{H}_{k,\ell}$ are block fading with coherence time $T$.
Fig. 2. Illustration of the communication frame used for the model in this paper. The beginning of the frame is used for overhead
of any nature, consuming $C(K, N_t, N_r)$ symbols. The remaining $T - C(K, N_t, N_r)$ symbols are used for data transmission.
New channels, independent of previous realizations, are generated at the end of the frame.
Fig. 3. Illustration of a partition of the $K$-user interference channel into two $K/2$-user interference channels transmitting
orthogonally to each other.



February 9, 2012 



DRAFT 






[Figure 4 diagram labels: P = 1 (IA), P = 2, ..., P = K (TDMA); axis label: Increasing capacity and overhead]



Fig. 4. Illustration of the parameterized suite of transmission strategies partitioning provides. With $P = 1$, all $K$ users transmit
simultaneously using interference alignment, which provides capacity gains at the cost of increased overhead. At $P = K$, the
users transmit orthogonally in TDMA/FDMA fashion, with relatively low complexity and overhead, but also lower capacities.
For $2 \leq P \leq K - 1$, the interference channel is partitioned into smaller groups which transmit IA within the groups, but
orthogonal to other groups.
Fig. 5. Illustration of the modifications needed to transform an IA-only system into a partitioning system. The partitioning
function, if implemented in a greedy manner as explained in this section, requires no more communication to or from the
transmitters than the IA-only system.
Fig. 6. Sum rate versus SNR of the approximation in (13) for the 3-user MIMO interference channel with 4 antennas at each
node and 2 streams per user. In this simulation and unless noted otherwise, SNR is the signal-to-noise ratio of all links in the
interference channel, including interfering links.
Fig. 7. Sum rate versus number of groups for the 6-user MIMO interference channel with $N_t = 3$ and $N_r = 4$. In this figure,
$P = 1$ groups corresponds to IA over the entire network while $P = 6$ groups corresponds to TDMA. With large $\alpha$, IA is not
practical because overhead dominates the frame. As the coherence time increases ($\alpha$ decreases), however, $P = 1$ (i.e., applying
IA over the network) is the sum-rate-optimal partition.
Fig. 8. Sum rate versus $\alpha$ for exhaustive search, greedy partitioning, IA, and TDMA. For this simulation, the users are kept at
$K = 3$ and there are $N_t = N_r = 2$ antennas at each node, one stream is sent by each transmitter in groups utilizing IA, and 2
streams are sent when a group consists of one node. The horizontal axis corresponds to the percentage of the coherence interval
available for data transmission after overhead. At low coherence times, the overhead required for IA dominates its performance
and utilizing TDMA results in a better sum rate. As the coherence time increases, IA gains begin to outweigh the overhead
costs, and IA has a higher sum rate.
Fig. 9. Sum rate versus $\alpha$ for greedy partitioning, IA, and TDMA. For this simulation, the users are kept at $K = 6$ and there are
$N_t = 3$ antennas at each transmitter and $N_r = 4$ antennas at each receiver. As in Figure 8, the horizontal axis corresponds to the
percentage of the coherence interval available for data transmission after overhead. A larger gain is available when partitioning
with more users relative to the 3 users of Figure 8.
Fig. 10. Sum rate versus $\alpha$ for greedy partitioning with geographic grouping, IA-sum rate grouping, IA, and TDMA. For this
simulation, a cellular channel model is used for a 6-cell arrangement.
Fig. 11. Sum rate lower bound versus $\tau$ for $K = 4, 9, 19$ users, $M = 10$ antennas, $\rho = 0, 10, 20$ dB, and coherence time
$T = 200$ symbols. Optimal $\tau$ values for $\rho = 0, 10, 20$ dB are 42, 26, and 18 symbols, respectively.
1. Find $K_0$ according to (15).
2. $P = \lceil K / K_0 \rceil$.
3. Set $\mathcal{K}_a = \{1, \ldots, K\}$ and $\mathcal{K}_p = \emptyset$ for $p \in \{1, \ldots, P\}$.
4. Find $R_{k,p}$ for $k \in \mathcal{K}_a$ and $p \in \{1, \ldots, P\}$.
5. Let $\{k', p'\} = \arg\max_{k,p} R_{k,p}$.
6. Add $k'$ to the set $\mathcal{K}_{p'}$ and remove it from $\mathcal{K}_a$.
7. If $\mathcal{K}_a \neq \emptyset$, return to 4; else done.

TABLE I

Greedy algorithm based on IA rate and group size approximations.